Process Trace Clustering: A Heterogeneous Information Network Approach

Authors

  • Phuong Nguyen
  • Aleksander Slominski
  • Vinod Muthusamy
  • Vatche Ishakian
  • Klara Nahrstedt
Abstract

Process mining is the task of extracting information from event logs, such as those generated by workflow management or enterprise resource planning systems, in order to discover models of the underlying processes, organizations, and products. Because event logs often contain a variety of process executions, the discovered models can be complex and difficult to comprehend. Trace clustering addresses this problem by splitting the event log into smaller subsets and applying process discovery algorithms to each subset, which yields per-subset process models that are less complex and more accurate. However, state-of-the-art clustering techniques are limited: their similarity measures are not process-aware, and they do not scale well to high-dimensional event logs. In this paper, we propose conceptualizing a process's event logs as a heterogeneous information network in order to capture their rich semantics and thereby derive better process-specific features. In addition, we propose SeqPathSim, a meta path-based similarity measure that considers node sequences in the heterogeneous graph and results in better clustering. We also introduce a new dimension reduction method that combines event similarity with regularization by process model structure to handle high-dimensional event logs. The experimental results show that our proposed approach outperforms state-of-the-art trace clustering approaches on both accuracy and structural complexity metrics.
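
To make the idea of a meta path-based similarity between traces concrete, here is a minimal sketch. It does not implement SeqPathSim, whose definition is not given in the abstract. It implements the standard PathSim measure over an assumed symmetric meta path Trace-Activity-Trace, which ignores the order of activities that SeqPathSim is said to take into account. The function name `pathsim_matrix`, the toy activity names, and the choice of meta path are all illustrative assumptions.

```python
from collections import Counter


def pathsim_matrix(traces):
    """Plain PathSim over the meta path Trace-Activity-Trace.

    Assumption: each trace is a list of activity names, and a path
    instance links two traces through one shared activity occurrence.
    This is NOT SeqPathSim; activity order is ignored here.
    """
    counts = [Counter(trace) for trace in traces]

    def path_count(a, b):
        # Number of Trace-Activity-Trace path instances between two traces.
        return sum(a[act] * b[act] for act in a if act in b)

    n = len(traces)
    sim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            # PathSim: 2 * paths(x, y) / (paths(x, x) + paths(y, y))
            denom = path_count(counts[i], counts[i]) + path_count(counts[j], counts[j])
            sim[i][j] = 2 * path_count(counts[i], counts[j]) / denom if denom else 0.0
    return sim


# Toy event log with three traces (hypothetical activity names).
traces = [
    ["register", "check", "approve"],
    ["register", "check", "check", "approve"],
    ["register", "reject"],
]
sim = pathsim_matrix(traces)
for row in sim:
    print([round(value, 3) for value in row])
```

The resulting matrix could be passed to any clustering algorithm that accepts pairwise similarities. In this toy log the first two traces come out as most similar, since they share all of their activities.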

Similar resources

Improving Accuracy in Intrusion Detection Systems Using Classifier Ensemble and Clustering

Recently, with the development of technology, the number of network-based services is increasing, and sensitive information of users is shared through the Internet. Accordingly, large-scale malicious attacks on computer networks could cause severe disruption to network services, so cybersecurity has become a major concern for networks. An intrusion detection system (IDS) could be cons...

Clustering Traces Using Sequence Alignment

Process mining discovers process models from event logs. Logs containing heterogeneous sets of traces can lead to complex process models that try to account for very different behaviour in a single model. Trace clustering identifies homogeneous sets of traces within a heterogeneous log and allows for the discovery of multiple, simpler process models. In this paper, we present a trace clustering ...

A Hybrid Time Series Clustering Method Based on Fuzzy C-Means Algorithm: An Agreement Based Clustering Approach

In recent years, the advancement of information gathering technologies such as GPS and GSM networks has led to huge, complex datasets such as time series and trajectories. As a result, it is essential to use appropriate methods to analyze the large raw datasets produced. Extracting useful information from large data sets has always been one of the most important challenges in different sciences,...

Improving Lifetime of Strategic Information Network in Oil Supply Chain

Today, information networks play an important role in supply chain management. Therefore, in this article, clustering-based routing protocols, which are one of the most important ways to reduce energy consumption in wireless sensor networks, are used to optimize the supply chain informational cloud network. Accordingly, first, a clustering protocol is presented using self-organizing map neu...

Explaining the Role of Management Accounting Information System in Strategy Formulation with Actors Network Approach

The real challenge of the business environment arises when organizations need to find opportunities to introduce ideas and new products to market that provide a stream of future earnings. Management accounting is used as a tool in this process and provides information on opportunities and threats. The purpose of this research is to explain the role of management accounting i...

Publication date: 2016